ENH: disallow numpy generics #145

Merged
merged 2 commits, Apr 23, 2025

Conversation

crusaderky
Contributor

@crusaderky crusaderky commented Apr 2, 2025

In #135, binary operators vs. numpy.int64 became (correctly) disallowed.
Add a test for it.
Also test binary operators vs. numpy.float64 and numpy.complex128, which must be rejected even though they are subclasses of float and complex respectively.

The same treatment for functions is in a separate PR: #148.

)
def test_binary_operators_vs_numpy_float(op):
    """
    np.float64 is a subclass of float and must be allowed.

Member

Have to admit I'm not a fan of this. It is technically correct: np.float64 is a subclass of float. That said, it is an implementation detail IMO.

In all instances where numpy returns a numpy scalar, all other libraries return 0-D arrays, so I think we'd better always handle numpy scalars together with arrays, not Python scalars.

Contributor Author
@crusaderky crusaderky commented Apr 8, 2025

Are you saying that

x = xp.asarray(1, dtype=xp.float32)
y = np.float64(2)
z = xp.maximum(x, y)

maximum should crash instead of returning an Array[float32]?

Member

Definitely not crash :-).

I think the following is correct (because np.float64(2) is conceptually np.asarray(2.0), not the Python scalar 2.0):

In [4]: xp = array_namespace(np.empty(2))

In [5]: xp.maximum(xp.asarray(1, dtype=xp.float32), 2)
Out[5]: np.float32(2.0)

In [6]: xp.maximum(xp.asarray(1, dtype=xp.float32), np.float64(2))
Out[6]: np.float64(2.0)

Likewise, the following is also correct, for the same reason:

In [12]: xpt = array_namespace(torch.empty(2))

In [13]: xpt.maximum(xpt.asarray(1, dtype=xpt.float32), np.float64(2))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 xpt.maximum(xpt.asarray(1, dtype=xpt.float32), np.float64(2))

File ~/repos/array-api-compat/array_api_compat/torch/_aliases.py:91, in _two_arg.<locals>._f(x1, x2, **kwargs)
     88 @_wraps(f)
     89 def _f(x1, x2, /, **kwargs):
     90     x1, x2 = _fix_promotion(x1, x2)
---> 91     return f(x1, x2, **kwargs)

TypeError: maximum(): argument 'other' (position 2) must be Tensor, not numpy.float64

Contributor Author

In [6]: xp.maximum(xp.asarray(1, dtype=xp.float32), np.float64(2))
Out[6]: np.float64(2.0)

This definitely looks wrong to me: array-api-strict shouldn't quietly accept a numpy array.
Right now the output is the expected one, without explicitly disallowing inheritance:

>>> xp.maximum(xp.asarray(1, dtype=xp.float32), np.float64(2))
Array(2., dtype=array_api_strict.float32)
>>> xp.maximum(xp.asarray(1, dtype=xp.float32), np.asarray(2, dtype=np.float64))
TypeError: Only real numeric dtypes are allowed in maximum(...). Got array_api_strict.float32 and float64.
>>> xp.maximum(xp.asarray(1, dtype=xp.int32), np.int64(2))
TypeError: Only real numeric dtypes are allowed in maximum(...). Got array_api_strict.int32 and int64.
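
The asymmetry demonstrated above traces back to numpy's class hierarchy. The checks below (runnable with plain numpy) show that np.float64 and np.complex128 pass Python-scalar isinstance checks, while np.int64 does not, which is why the integer case was already rejected as a side effect of #135:

```python
import numpy as np

# np.float64 and np.complex128 subclass the Python builtins, so a plain
# isinstance(x, float | complex) check quietly accepts them:
assert isinstance(np.float64(2), float)
assert isinstance(np.complex128(2), complex)

# np.int64 does NOT subclass int, so the same style of check rejects it:
assert not isinstance(np.int64(2), int)
```

Disallowing np.float64 and np.complex128 therefore needs an explicit check rather than relying on the Python-scalar isinstance test.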

Contributor Author

nit: the message of the TypeError is a bit misleading; it should explicitly call out something like "only Array and Python scalars are accepted; got numpy.ndarray".

Member

This definitely looks wrong to me: array-api-strict shouldn't quietly accept a numpy array.
Right now the output is the expected one, without explicitly disallowing inheritance:

Yes, in my example xp is array_api_compat.numpy. As your example shows, array-api-strict does the right thing in main, too. So what is this PR fixing?

Member

@ev-br ev-br left a comment

So this PR does two things:

  • adds a test that binary dunders reject arrays on different devices
  • adds tests which codify the behavior of binary dunders with numpy scalars.

The former is great IMO; the latter is less so. Essentially, I'd consider it a bug that np.float32(1.) behaves differently from np.float64(1.). Either they both should duck-type as the built-in float, or they both should duck-type as a 0-D array of the corresponding dtype.

To tell which behavior is better, consider how a numpy scalar might find its way into array-api-strict. I see two possible scenarios:

  • a user explicitly creates an np.int64(python_int) or np.float32(python_float). Old code does that sometimes. IMO the user code should decide whether it needs a builtin object or a 0-D array, and convert accordingly.
  • some numpy operation (indexing or a ufunc, for instance) creates a numpy scalar. In those cases any other backend would have created a 0-D array. Therefore, array-api-strict can have two possible behaviors: reject all numpy scalars or convert them into 0-D arrays.

Rejecting numpy scalars is probably not very useful: it would require that all user code explicitly call asarray just to make user tests pass on both numpy and array-api-strict.
Converting them all sounds more useful for downstream testing, which is the main purpose of array-api-strict.

Therefore, I suggest that instead of codifying the existing bug, we add a test (or several) that all instances of np.generic duck-type as arrays in both binary functions and binary dunders, and actually make it happen: I think it will require something like if isinstance(x, np.generic): x = xp.asarray(x) before checking isinstance(x, bool | int | float | complex).
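
The suggested conversion could look like the following sketch (hypothetical helper, not array-api-strict's actual code; plain np.asarray stands in for the namespace's own asarray):

```python
import numpy as np

def coerce_operand(x, asarray=np.asarray):
    """Sketch of the suggested conversion: treat any numpy scalar
    (an np.generic instance) as a 0-D array before the Python-scalar
    check. Hypothetical helper for illustration only."""
    if isinstance(x, np.generic):
        # np.float64(2) is conceptually np.asarray(2.0), not the scalar 2.0,
        # so convert it before it can pass the float/complex subclass check.
        return asarray(x)
    if isinstance(x, (bool, int, float, complex)):
        return x  # genuine Python scalar: promote via the usual dtype rules
    raise TypeError(f"Expected Array or Python scalar; got {type(x)}")
```

With plain numpy as the backend, `coerce_operand(np.float64(2))` returns a 0-D float64 array, while `coerce_operand(2.0)` passes the Python float through unchanged.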

else:
    raise TypeError(f"Expected Array | python scalar; got {type(other)}")
elif not isinstance(other, bool | int | float | complex):
    raise TypeError(f"Expected Array or Python scalar; got {type(other)}")

Member

Note to self: the changes to _check_device are just cosmetic, plus a rename; the functionality is the same.

"__rshift__": "integer",
"__sub__": "numeric",
"__truediv__": "floating-point",
"__xor__": "integer or boolean",

Member

Note to self: the values here and in unary_op_types below correspond to dtype categories from _dtypes.py, so these two dicts are similar to those in https://github.com/data-apis/array-api-strict/blob/main/array_api_strict/tests/test_elementwise_functions.py#L36

a = asarray(1)
i64 = np.int64(1)
with pytest.raises(TypeError, match="Expected Array or Python scalar"):
    getattr(a, op)(i64)

Member

Note to self: this test basically codifies the status quo; the behavior is the same on main.

for op in binary_op_dtypes:
    assert isinstance(func(f64), Array)
    with pytest.raises(TypeError, match="Expected Array or Python scalar"):
        func(f32)

Member

This test also reflects the status quo: the behavior is exactly the same on main.

a = asarray(1, dtype=dtype, device=CPU_DEVICE)
b = asarray(1, dtype=dtype, device=Device("device1"))
with pytest.raises(ValueError, match="different devices"):
    getattr(a, op)(b)

Member

This test would be great to extend to binary functions, too. I'd expect them to work just the same, because they reuse the ._promote_scalar logic. If xp.add(a, b) differs from a.__add__(b), it's a bug, so it would be great to test for it.
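
The invariant described above (a namespace function and the matching dunder must agree) can be sketched as a small generic checker. This is a hypothetical test helper, exercised here with plain Python ints and the operator module rather than array-api-strict arrays:

```python
import operator

def agree(func, dunder, a, b):
    """Return True if func(a, b) and getattr(a, dunder)(b) either both
    succeed with equal results, or both raise the same exception type.
    Hypothetical helper illustrating the consistency invariant above."""
    def outcome(call):
        try:
            return ("ok", call())
        except Exception as exc:
            return ("err", type(exc))
    return outcome(lambda: func(a, b)) == outcome(lambda: getattr(a, dunder)(b))

# With plain ints the namespace-level function and the dunder trivially agree:
assert agree(operator.add, "__add__", 1, 2)
```

In a real test suite, `func` would be something like `xp.add` and `a`, `b` would be arrays on mismatched devices, so both paths should raise the same error.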

Contributor Author

It turns out extending this to binary functions caused the scope to blow up, so I moved it to a separate PR: #148

@crusaderky
Contributor Author

I've reworked this PR to disallow np.float64 and np.complex128 from being treated as float and complex in Array methods.
The same treatment for functions is in #148.
This is ready for a second round of review.
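
The order of the checks matters here, because np.float64 and np.complex128 pass isinstance checks against float and complex. A minimal sketch of the kind of rejection logic involved (hypothetical, not the actual diff in this PR):

```python
import numpy as np

def check_scalar_operand(other):
    """Minimal sketch of rejecting numpy generics (hypothetical helper,
    not array-api-strict's actual code). The np.generic check must come
    first: np.float64 and np.complex128 subclass float and complex, so
    they would otherwise slip through the Python-scalar check below."""
    if isinstance(other, np.generic):
        raise TypeError(f"Expected Array or Python scalar; got {type(other)}")
    if isinstance(other, (bool, int, float, complex)):
        return other
    raise TypeError(f"Expected Array or Python scalar; got {type(other)}")
```

Under this check, `check_scalar_operand(2.0)` passes the Python float through, while `check_scalar_operand(np.float64(2))` raises TypeError even though the argument is a float subclass.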

@crusaderky crusaderky changed the title TST: test binary operators vs. numpy generics ENH: disallow numpy generics Apr 23, 2025
Member

@ev-br ev-br left a comment

LGTM now.

There is a non-zero chance that this causes issues downstream. However, I believe the change is correct and useful, and it's downstream that should adjust to better align with the spec.
If push comes to shove, we'll temporarily revert. So let's do EAFP, not LBYL, and land this. Thanks @crusaderky

@ev-br ev-br merged commit 17c7c40 into data-apis:main Apr 23, 2025
19 checks passed
@ev-br ev-br added this to the 2.4 milestone Apr 23, 2025